ASSEMBLER SET



The present invention relates to an assembler for a
microprocessor and to a method of preparing a program for
execution on a microprocessor.

During the preparation of a binary machine language program for
execution on a microprocessor a number of operations are
effected. Such operations include translation of a
representation of the program in a source language into the
binary machine language, binding symbols to addresses, and
debugging the program. The process of translation may be
accomplished by a compiler which receives as its inputs a high
level language representation of the program, or an assembler
which receives as its inputs an assembly language representation
of the program.

A problem exists in the preparation of programs for 
microprocessors which are still in the process of development in 
that the architecture of the instruction set may alter during 
development of the microprocessor. Such changes may take the 
form of altering the size and location of instruction operands or 
the writing of new instructions. 

Assemblers are typically programs which translate instructions 
comprising mnemonics and operands into binary representations and 
which translate the mnemonics and operands (including immediate 
data) into corresponding binary values, viz opcodes and encoded 
operands. Each instruction comprises a set of contiguous bit 
fields which fully characterize the instruction, where a bit 
field is a sequence of contiguous bits. 

The assembler must ensure, for example, that the binary
representation of each operand is located in the correct bit
field, and this is typically achieved by hard coding control
information into the assembler program. Likewise, if encoding a
particular operand involves a specified operation, for example
division by 2, this operation is typically hard coded into the
assembler.

A problem which arises if the instruction set architecture is
changed is that re-writing of the information hard coded into the
assembler, amongst other things, will be necessary to account for
the changes.

It is an object of the present invention to at least partially
overcome the difficulties of the prior art.

According to a first aspect of the present invention there is
provided an assembler for a target microprocessor, the assembler
comprising a descriptor file and a translation device, the
descriptor file containing information descriptive of the
instruction set of said target microprocessor and the translation
device in use translating assembly language into machine language
as an output, wherein the translation device comprises a fetching
device for acquiring data from said descriptor file and a control
device responsive to said fetching device for constraining the
output of the translation device to conform to the architecture
of said instruction set.

Preferably there is further provided a data capture device having
an input for accessing the instruction set of said target
microprocessor and having an output, wherein said output
comprises said descriptor file.

Preferably again there is further provided a linker, and said
assembler has a data transfer device for outputting selected data
from said descriptor file to said linker, whereby said linker
[illegible] said assembler.

According to a second aspect of the invention, there is provided
a method of assembling a machine language program for a target
microprocessor comprising:-



providing a descriptor file containing information 
descriptive of the instruction set of said target microprocessor; 



translating assembly language instructions into machine 
language, wherein the translating step comprises 

acquiring data from said descriptor file; and 

constraining the machine language to conform to the 
architecture of said instruction set by using said data. 

Preferably the step of providing a descriptor file comprises:

capturing data from the instruction set of said target
microprocessor;

and the step of translating assembly language instructions
comprises providing assembly language instructions for said
target microprocessor.

Preferably again the step of providing assembly language
instructions comprises:

providing plural program modules, at least one of said 
modules having one or more instructions including external 
symbols, wherein external symbols have values which cannot be 
determined without reference to another program module; 
and the method further comprises binding said external symbols to 
addresses using data selected from said descriptor file. 

An embodiment of the present invention will now be described, by 
way of example only, with reference to the accompanying drawings 
in which:-

Figure 1 shows a block diagram indicating the context of the
present invention.

Figure 2 is a block diagram showing an exemplary embodiment of
the invention.

Figure 3 shows an exemplary instruction from the instruction set
of Figure 2.

Figure 4 shows an entry in the description file shown in Figure 2
corresponding to the instruction of Figure 3.

Figure 5 shows a second exemplary instruction of the instruction
set of Figure 2; and

Figure 6 shows a second entry in the description file of Figure 2
corresponding to the instruction of Figure 5.

Every type of microprocessor has its own [illegible] processor.
Then the program is translated into machine language using a
translation [illegible] compiler.

It is [illegible] necessary to write programs [illegible]
instructions in assembly language, which is, like machine
language, unique to each type of microprocessor, but instead of
being written in [illegible] uses commands each corresponding to
one of the microprocessor opcodes, together with [illegible] used
to make a symbolic reference, typically the address of some named
location in memory. After writing a program module in assembly
language, the module is translated using an assembler into
machine language.



Instructions consist of a number of bit fields each representing 
different information required to carry out an operation. Such 
fields include opcodes, operands and fields reserved for 
architecture use, the operands including register designators and 
immediate data. For any one microprocessor it is possible for 
instructions to have different formats appropriate to the 
operation being performed. Thus, one opcode may require two 
operands whereas another opcode may merely require a single 
operand. Furthermore the size of a bit field available for the 
operand is likely to vary depending on the format which in turn 
depends upon the nature of the operand - for example a register 
identifier may be very much smaller than an instruction 
displacement.

In Figure 1 a first source code module 1 written in assembly 
language is input to assembler 2 to provide an object code module 
3 which is in machine language and is directly analogous to the 
assembly language source code module. A second source code
module 11, in assembly language, is input to an assembler 12 to
provide an object code module 13. It will of course be clear to
those skilled in the art that more than two source code modules 
may be provided and that the same assembler could be used for 
each of a plurality of source code modules, the modules being 
assembled sequentially. 

The object code modules provide an input to a linker 4 which may
also receive an input from an object code library 6. The
function of the linker includes binding those operands which are
external symbols to addresses so that the object code modules
cooperate together to form executable machine code. Some code
modules from the library will be required to effect this, in the
case that the symbolic references are to objects in the library.
The linker thus performs the function of a link editor.
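
By way of illustration only, the binding of external symbols by a
linker may be sketched in Python as follows. The module layout (a
list of words, a table of exported symbols and a list of patches)
and the names `link`, `exports` and `patches` are assumptions
made for this sketch; the specification does not define an object
code format.

```python
# Hypothetical object-module layout, chosen only for illustration.
def link(modules, base=0):
    """Concatenate modules and bind external symbols to addresses."""
    # Pass 1: give every exported symbol an absolute address.
    symbols, addr = {}, base
    for m in modules:
        for name, local_offset in m["exports"].items():
            symbols[name] = addr + local_offset
        addr += len(m["words"])
    # Pass 2: copy the code and patch each external reference.
    image = []
    for m in modules:
        words = list(m["words"])
        for index, name in m["patches"]:
            if name not in symbols:
                raise KeyError(f"unresolved external symbol: {name}")
            words[index] = symbols[name]
        image.extend(words)
    return image


# Module "main" refers to "fred", which is defined in a library module.
main = {"words": [0x10, 0], "exports": {}, "patches": [(1, "fred")]}
lib = {"words": [0x99], "exports": {"fred": 0}, "patches": []}
image = link([main, lib])
```

In this sketch the reference to "fred" cannot be resolved from
the first module alone, which is what makes it an external
symbol.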



In the prior art, as mentioned above, the assembler [illegible]
requires hard-coded [illegible] bits to the bit fields of the
[illegible] code equivalent. [illegible] instruction is to write
[illegible] determine which register, whereas if the [illegible]
to memory, a much larger bit field [illegible] to be encoded may
[illegible] programmer of [illegible] to specify both the size
and location of each bit field in each instruction, indeed by,
for example, [illegible] information into the assembler.
Although this may be acceptable where the instruction set
architecture is fixed and constant, it can cause severe
difficulties when a microprocessor or series of microprocessors
is under development.

Referring to Figures 2-6, an embodiment of the invention will be
described in which hard-coding of the assembler is reduced, thus
allowing an assembler to track changes in the instruction set
more readily. In a preferred embodiment the assembler can
automatically track changes in the instruction set as they
occur.

The arrangement shown in Figure 2 has some similarities to that
of Figure 1 [illegible] which is applied to an assembler
[illegible] outputs object code to a linker [illegible] modules,
shared [illegible].

The assembler 20 consists of a translation device 21 which
responds directly to the source code [illegible] output 22 which
represents a transliteration of the source code. The source code
is also applied to a fetching device whose function is to address
the descriptor file 24 with information derived from the source
code currently being translated to provide output information 25
representative of constraints due to the instruction set
architecture. The output information 25 is applied to a control
device 26 which operates on the translated information 22 to
constrain that information to conform with the requirements of
the instruction set. The output information 25 is also applied
to a data transfer device 27 whose output is combined with the
constrained translated information to provide an assembler
output 28.

Generally speaking, the data conforming to the source code is
provided from the control device 26 whereas the data transfer
device 27 provides the information directly to the linker 40.
This enables the linker to perform operations on the data, again
determined by the instruction set. If for example the
instruction requires scaling to be effected, this can be achieved
by providing the relevant scaling factor via the data transfer
device to the linker. This happens for example when the scaling
is to be applied to an external symbol, i.e. a symbol whose
value is known only at link time.
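
By way of illustration only, the passing of a scaling factor to
the linker may be sketched in Python as follows. The record type
`Relocation`, its field names and the function `apply_relocation`
are hypothetical, and the sketch assumes that bit 0 is the least
significant bit of the instruction word; none of these details is
given in the specification.

```python
from dataclasses import dataclass


@dataclass
class Relocation:
    """Hypothetical record passed from the assembler to the linker."""
    symbol: str     # external symbol whose value is known only at link time
    start_bit: int  # lowest bit of the operand's field (bit 0 = LSB, assumed)
    width: int      # number of bits in the field
    scale: int      # divisor taken from the descriptor file


def apply_relocation(word, reloc, address):
    """Link-time patch: scale the address, then insert it into its field."""
    if address % reloc.scale:
        raise ValueError(
            f"{reloc.symbol}: {address} is not a multiple of {reloc.scale}")
    value = address // reloc.scale
    if value >= 1 << reloc.width:
        raise ValueError(
            f"{reloc.symbol}: scaled value needs more than {reloc.width} bits")
    mask = ((1 << reloc.width) - 1) << reloc.start_bit
    return (word & ~mask) | (value << reloc.start_bit)


reloc = Relocation(symbol="fred", start_bit=0, width=7, scale=2)
patched = apply_relocation(0xB00, reloc, address=0x40)
```

Because the divisor travels with the record, the linker in this
sketch needs no built-in knowledge of the instruction set.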

In this embodiment, the descriptor file is derived manually from
manipulation and inspection of the instruction set 30 but in a
preferred embodiment, a utility program 31 is used to access
instruction set architecture data 30 to provide the descriptor
file 24.

Figures 3 and 5 show two exemplary instructions, each of which
is in fact 32 bits long, although fewer than this number of bits
is shown. In Figure 3 the instruction includes an opcode A which
requires two source registers RS1 and RS2 as well as a
destination register (not shown). In this instruction, the first
source register has a starting bit position of zero and a
finishing bit position of 5 and the second source register has a
starting bit position of 14 and a finishing bit position of 19.
The bits between bits 5 and 14 are either empty or serve a
function such as indicating an instruction format.
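
By way of illustration only, the placing of the two source
register operands of Figure 3 may be sketched in Python as
follows. The start and finish positions are those stated above;
the table `FIELDS`, the function `place` and the assumption that
bit 0 is the least significant bit are introduced for this
sketch.

```python
# (start bit, finish bit) for each operand, as stated for Figure 3.
# Bit 0 is ASSUMED to be the least significant bit of the 32-bit word.
FIELDS = {"RS1": (0, 5), "RS2": (14, 19)}


def place(word, field, value):
    """Insert an operand value into its bit field, checking that it fits."""
    start, finish = FIELDS[field]
    width = finish - start + 1
    if not 0 <= value < (1 << width):
        raise ValueError(f"{field}: {value} does not fit in {width} bits")
    return word | (value << start)


word = place(place(0, "RS1", 3), "RS2", 9)
```

Changing a field position then means editing the table only, not
the function that uses it.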
Referring to Figure 5, a second instruction has an opcode of B and
in this case requires immediate data [illegible] starting at bit
position 0 and running to bit position 7, and a destination
register whose numeric identity [illegible] of the [illegible].

It will be seen [illegible] instruction [illegible] output of the
assembler requires [illegible] of a [illegible] number of bits
available for that bit field.

The instruction set database comprises information about the bit
fields of each instruction. For instance it may be decided to
scale (divide) by two on an address operand so as to use a
smaller bit field. This information (the encoding function) is
used by the assembler and is passed to the linker for use with
external symbols.

By further providing the decoding function (e.g. multiply by two,
in the example above) a user's source can be checked for the
error of specifying an unencodeable value, such as an odd
address in this example, or a register whose number is out of
range.
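
By way of illustration only, the check for an unencodeable value
may be sketched in Python as follows, using the divide-by-two
encoding and multiply-by-two decoding of the example. The
function names and the `width` parameter are assumptions made for
this sketch.

```python
def encode(address):
    """Encoding function of the example: scale (divide) by two."""
    return address // 2


def decode(field):
    """Decoding function of the example: multiply by two."""
    return field * 2


def check_encodable(address, width):
    """Return the field value, or raise if the value cannot be encoded.

    `width` (the number of bits in the field) is a hypothetical parameter.
    """
    field = encode(address)
    if decode(field) != address:
        raise ValueError(f"address {address} cannot be encoded exactly")
    if not 0 <= field < (1 << width):
        raise ValueError(f"address {address} does not fit in {width} bits")
    return field


field = check_encodable(10, width=7)
```

The round trip through `decode` is what detects the error: a
value is accepted only if decoding its encoded form reproduces
it.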

[illegible] instruction set in the description file, part of
which is diagrammatically shown in Figures 4 and 6. In each of
these figures, the top line represents an identifier for the
operand of concern; in Figure 4 [illegible] "2A" represents
[illegible] and "DB" represents BD1, each of [illegible]
associated with the relevant bit field [illegible] starting bit
[illegible] operand, and the third line shows the [illegible] of
the operand. The fourth line shows the end bit of each operand.



For each instruction the syntax, e.g. the mnemonic and its
equivalent, is derived and stored, and for each bit field in the
instruction the type of information held in the bit field is
derived and stored: for instance "12" for a particular opcode,
all 0's for a reserved field, values 0-15 for a register,
address/4 for a symbolic value.
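
By way of illustration only, the stored information for one
instruction may be sketched in Python as follows. The class
`FieldSpec`, its attribute names, the field names in `ENTRY` and
the treatment of a reserved field as the constant zero are
assumptions made for this sketch; the specification does not
prescribe a file format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FieldSpec:
    """Hypothetical description of one bit field of an instruction."""
    name: str
    kind: str                        # "opcode", "reserved", "register", "symbol"
    constant: Optional[int] = None   # fixed value for opcode/reserved fields
    max_value: Optional[int] = None  # highest legal register number
    divisor: int = 1                 # scaling applied to a symbolic value


def field_value(spec, operand=None):
    """Compute the value to be stored in a field from its description."""
    if spec.kind in ("opcode", "reserved"):
        return spec.constant
    if spec.kind == "register":
        if not 0 <= operand <= spec.max_value:
            raise ValueError(f"{spec.name}: register {operand} out of range")
        return operand
    if spec.kind == "symbol":
        if operand % spec.divisor:
            raise ValueError(f"{spec.name}: {operand} is not encodable")
        return operand // spec.divisor
    raise ValueError(f"unknown field kind: {spec.kind}")


# One entry mirroring the four kinds of information listed above.
ENTRY = [
    FieldSpec("op", "opcode", constant=12),
    FieldSpec("pad", "reserved", constant=0),
    FieldSpec("Rn", "register", max_value=15),
    FieldSpec("target", "symbol", divisor=4),
]
values = [field_value(ENTRY[0]), field_value(ENTRY[1]),
          field_value(ENTRY[2], 7), field_value(ENTRY[3], 64)]
```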

The assembler then accesses the stored information to enable it 
to accept assembly language instructions, from for example source 
code modules, and encode those instructions to provide a 
translated output in machine language. 

As an example, for a microprocessor having a set of 16 registers, 
it is decided to define an instruction for loading a value from 
external memory to a specified one of the set of registers. Such 
an instruction is given the mnemonic MOV and the particular 
register is defined as Rn, with n=0-15. It is further decided 
that bits 8-12 will contain the pattern 10110 for this 
instruction in a 2 byte instruction length. The address is to be 
encoded in bits 1-7 and the register number is specified in bits 
13-16.

This data is stored as the descriptor file. When supplied with 
user source code of the form: 

    MOV fred, R12    ("fred" being an external symbol)

the assembler accesses the descriptor file to enable it to
transliterate the assembly language into machine language,
filling in the opcode and register. Moreover the necessary data
are supplied from the descriptor file to tell the linker how
to patch in "fred".
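
By way of illustration only, the MOV example may be sketched in
Python as follows. The bit positions, the pattern 10110 and the
register range are those given above; the dictionary layout, the
function names and the assumption that bit 1 is the least
significant bit of the 2 byte instruction are introduced for this
sketch.

```python
import re

# Descriptor entry for the MOV example (layout is hypothetical).
# Bit 1 is ASSUMED to be the least significant bit of the 16-bit word.
MOV = {
    "mnemonic": "MOV",
    "fixed": {"start": 8, "end": 12, "value": 0b10110},
    "address": {"start": 1, "end": 7},
    "register": {"start": 13, "end": 16, "max": 15},
}


def insert(word, field, value):
    """Place a value in a field described by 1-based start and end bits."""
    width = field["end"] - field["start"] + 1
    if not 0 <= value < (1 << width):
        raise ValueError("value does not fit in field")
    return word | (value << (field["start"] - 1))


def assemble(line, desc=MOV):
    """Translate 'MOV <symbol>, R<n>' using only data from the descriptor."""
    m = re.fullmatch(r"\s*(\w+)\s+(\w+)\s*,\s*R(\d+)\s*", line)
    if not m or m.group(1) != desc["mnemonic"]:
        raise SyntaxError(f"cannot parse: {line!r}")
    symbol, reg = m.group(2), int(m.group(3))
    if reg > desc["register"]["max"]:
        raise ValueError("register out of range")
    word = insert(0, desc["fixed"], desc["fixed"]["value"])
    word = insert(word, desc["register"], reg)
    # The address field is left zero; the linker is told where to patch it.
    return word, {"symbol": symbol, "field": desc["address"]}


def patch(word, reloc, address):
    """Link-time step: fill in the address of the external symbol."""
    return insert(word, reloc["field"], address)


word, reloc = assemble("MOV fred, R12")
final = patch(word, reloc, 0x25)
```

The address 0x25 used for "fred" is an arbitrary value chosen for
the sketch; in practice it would be known only at link time.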

Claims



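1. An assembler for a target microprocessor, the assembler
comprising a descriptor file and a translation device, the
descriptor file containing information descriptive of the
instruction set of said target microprocessor and the translation
device in use translating assembly language into machine
language as an output, wherein the translation device comprises a
fetching device for acquiring data from said descriptor file and
a control device responsive to said fetching device for
constraining the output of the translation device to conform
to the architecture of said instruction set.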

2. An assembler as claimed in claim 1 wherein the descriptor
file further comprises syntax information for each instruction
and the translation device translates each instruction on the
basis of said syntax information.

3. A device for preparing a program executable on a target
microprocessor, comprising an assembler in accordance with claim
1 or 2 and further comprising a data capture device for accessing
the instruction set of said target microprocessor [illegible]
said descriptor file.

4. A device for preparing a program executable on a target
microprocessor, said device comprising a linker and further
comprising an assembler in accordance with claim 1 or 2, wherein
said [illegible] data [illegible] said assembler.

5. A method of assembling a machine language program for a
target microprocessor comprising:-

providing a descriptor file containing information
descriptive of the instruction set of said target microprocessor; 

translating assembly language instructions into machine 
language wherein the translation step comprises 

acquiring data from said descriptor file; and 

constraining the machine language to conform to the 
architecture of said instruction set. 

6. A method as claimed in claim 5 wherein said descriptor file 
further contains syntax information for each possible instruction 
of the instruction set, and said translating step comprises 
transliterating each assembly language instruction using said 
syntax information. 

7. A method of preparing a program executable on a target
microprocessor comprising: 

capturing data from the instruction set of said target 
microprocessor thereby forming a descriptor file containing 
information descriptive of said instruction set; 

providing assembly language instructions for said target
microprocessor;

translating each assembly language instruction into a
corresponding machine language output; and

using data from said descriptor file, constraining the 
machine language output to conform to the architecture of said 
instruction set. 



8. A method of preparing a program executable on a target
microprocessor, comprising:

providing plural program modules, at least one of said 
modules having one or more instructions including external 
symbols, wherein external symbols have values which cannot be 
determined without reference to another program module; 

providing a descriptor file containing information
descriptive of the instruction set of said target microprocessor;

translating assembly language instructions into machine 
language wherein the translation step comprises 

acquiring data from said descriptor file; 

constraining the machine language to conform to the
architecture of said instruction set;

and further comprising binding external symbols to addresses
using data selected from said descriptor file.

ABSTRACT
ASSEMBLER DATA 

An assembler for a microprocessor has a file which contains 
data describing the instruction set of the microprocessor. A 
translation device for translating into machine language accesses 
the instruction set descriptors to constrain the machine code 
output of the assembler to conform to the architecture of the 
instruction set.